Optimal reconstruction of the folding landscape using differential energy surface analysis
In experiments and in simulations, the free energy of a state of a system can be determined from the probability
that the state is occupied. However, it is often necessary to impose a biasing potential on the system so that 
high energy states are sampled with a reasonable frequency. The unbiased energy is typically obtained from 
the data using the weighted histogram analysis method (WHAM). Here we present differential energy surface 
analysis (DESA), in which the gradient of the energy surface, dE/dx, is extracted from data taken with a 
series of harmonic biasing potentials. It is shown that DESA produces a maximum likelihood estimate of the 
folding landscape gradient. DESA is used to analyze extension vs. time data taken as an optical trap is used to 
unfold a DNA hairpin under a harmonic constraint. It is shown that the energy surface obtained from DESA is 
indistinguishable from the energy surface obtained when WHAM is applied to the same data. Two criteria are 
defined which indicate whether the DESA results are self-consistent. It is found that these criteria are satisfied 
for the experimental data analyzed, confirming that data taken with different constraint origins are sampling 
the same effective energy surface. The combination of DESA and the optical trap assay in which a structure 
is disrupted under harmonic constraint facilitates an extremely accurate measurement of the folding energy 
surface. 

PACS numbers: 87.14.G-,87.15.Cc,87.15.hm 






I. INTRODUCTION 

In a dynamical system driven by thermal fluctuations the 
effective energy E as a function of conformation x is related 
to the probability p that the conformation is observed by the 
Boltzmann formula, 



$$p(x) = \exp\left[-\frac{E(x)}{k_B T}\right], \qquad (1)$$

where k_B is the Boltzmann constant and T is the temperature.
The conformation of a simple system may be specified by a 
small number of variables. However, in studies of the folding 
of bio-polymers the conformational space of the system has 
many degrees of freedom. In some cases, such systems can be 
described in terms of a single reaction coordinate, x, and the 
dynamics of the system can be modeled by diffusion in this 1D
space under the influence of an effective energy [1-4]. In numerical
simulations the reaction coordinate may be the radius
of gyration of the structure, the fraction of native contacts, or 
another measure of the level of compaction or organization of 
the molecule. In single molecule manipulation experiments 
the end-to-end extension of the molecule is typically used as 
a reaction coordinate [5]. The energy as a function of the reaction
coordinate follows from Eq. 1 as

$$E(x) = -k_B T \ln(p(x)) + c. \qquad (2)$$

The arbitrary constant c is included because the energy of a
system is only defined up to an additive constant. Although
this formula can be used, in principle, to determine the energy
surface from the probability density function, this is only
practical when the energy varies over a range which is not large
compared with k_B T. The exponential dependence of the probability
density on E means that states with relative energy that
is large compared with k_B T will be effectively impossible to
sample in a finite time.
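As a minimal illustration of Eq. 2, the following Python sketch converts histogram counts into an energy in units of k_B T. It is not taken from the original analysis: the function name, the choice of c (minimum energy set to zero) and the use of None for empty bins are assumptions made for this example.

```python
import math

def energy_from_counts(counts, kT=1.0):
    """Eq. 2: E = -kT ln p + c, with c chosen so that min(E) = 0.

    Empty bins have undefined energy and are returned as None.
    """
    total = sum(counts)
    energies = [
        -kT * math.log(n / total) if n > 0 else None
        for n in counts
    ]
    e_min = min(e for e in energies if e is not None)
    return [e - e_min if e is not None else None for e in energies]

# A bin visited about e^2 (roughly 7.4) times less often lies about 2 kT higher.
E = energy_from_counts([1000, 135, 0])
```

The empty third bin shows the sampling problem described above: a state that is never visited yields no energy estimate at all.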



One solution to this problem is to apply an external force
field to the system which tends to bias it towards the regions
of the reaction coordinate that would otherwise be poorly sampled.
Often, this takes the form of a harmonic constraint,
which adds an additional term a(x - x_0)^2/2 to the energy,
where a is the effective stiffness and x_0 is the origin of the
constraint. By selecting an appropriate value of a and varying
x_0, the system can be forced to visit various regions of
the reaction coordinate, allowing more uniform convergence
of statistics. This technique, often referred to as umbrella
sampling [6], is widely used in simulations [7], and has been
applied to single molecule experiments [8].

We can still apply Eq. 2 to the system with a specific configuration
of the harmonic constraint, but we will obtain a biased
energy which is the sum of the intrinsic energy and the energy
of the constraint. To find the unbiased energy, we subtract the
known constraint energy, and obtain

$$E_j(x) = -k_B T \ln(p_j(x)) - \frac{1}{2} a (x - x_j)^2 + c_j. \qquad (3)$$

For each position of the constraint x_j we obtain a measurement
of the energy surface E_j(x) over the region visited by
the system. Each local energy surface E_j(x) contains an independent
constant c_j.

If we wish to find the global energy surface, defined over
the entire domain of x, we need to choose the constants c_j
and combine the local energy landscapes E_j(x) in a self-consistent
manner. If there is substantial overlap between the
domains of the local landscapes, the constants c_j can be determined
by requiring that the energy surfaces corresponding
to different constraints are consistent in the overlap regions.

The weighted histogram analysis method (WHAM) has 
been formulated to reconstruct the energy surface E(x) from 
Monte Carlo or molecular dynamics simulations with arbitrary
biasing potentials [9-12]. The method provides an optimal
estimate for the unbiased probability density p(x),

$$p(x) = \sum_i p_i(x) w_i(x), \qquad (4)$$

where p_i(x) is the probability density sampled in biased simulation
i and the summation is over all simulations. In the case
of a harmonic constraint centered at x_i, the weights w_i(x) are
given by

$$w_i(x) = \frac{M_i}{\sum_k M_k \exp\left[f_k - \frac{a(x - x_k)^2}{2 k_B T}\right]}, \qquad (5)$$

where M_i is the total number of measurements in simulation
i. The constants f_i are defined implicitly by a system of nonlinear
equations,

$$\exp(-f_i) = \int dx\, \exp\left[-\frac{a(x - x_i)^2}{2 k_B T}\right] p(x) = \int \exp\left[-\frac{a(x - x_i)^2}{2 k_B T}\right] \frac{\sum_j H_j(x)}{\sum_k M_k \exp\left[f_k - \frac{a(x - x_k)^2}{2 k_B T}\right]}, \qquad (6)$$

where the histogram count H_j(x) is the number of measurements
between x and x + dx in system j, and is related to the
probability by p_j(x) dx = H_j(x)/M_j.
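The self-consistent solution of Eqs. 4-6 is usually found by fixed-point iteration. The Python sketch below illustrates that iteration for harmonic biases on a shared grid of bins. It is an illustrative sketch under stated assumptions, not the authors' implementation: the function name, the fixed iteration count (no convergence test) and the final normalization of p over the bins are choices made for this example.

```python
import math

def wham(hists, centers, origins, stiffness, kT, n_iter=2000):
    """Iterate Eqs. 4-6 for harmonic biases; returns (p, f).

    hists[j][i] : counts H_j(x_i) on the common grid `centers`
    origins[j]  : constraint origin x_j
    f[j]        : dimensionless constants f_j of Eq. 6
    p[i]        : unbiased probability of bin i (normalized to sum to 1)
    """
    n_win = len(hists)
    M = [sum(h) for h in hists]
    # Boltzmann factor of each bias, exp[-a (x - x_k)^2 / (2 kT)]
    bias = [[math.exp(-stiffness * (x - xk) ** 2 / (2 * kT)) for x in centers]
            for xk in origins]
    f = [0.0] * n_win
    for _ in range(n_iter):
        p = []
        for i in range(len(centers)):
            num = sum(hists[j][i] for j in range(n_win))
            den = sum(M[k] * math.exp(f[k]) * bias[k][i] for k in range(n_win))
            p.append(num / den)
        # Eq. 6: exp(-f_k) = sum over bins of bias_k(x) p(x)
        f = [-math.log(sum(b * pi for b, pi in zip(bias[k], p)))
             for k in range(n_win)]
    norm = sum(p)
    return [pi / norm for pi in p], f
```

Because the constants f_k and the estimate p depend on each other, every pass must revisit all windows and all bins, which is the computational cost that a closed-form estimate avoids.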

In the following section, we describe another method of obtaining
the global energy surface which we call differential
energy surface analysis (DESA). In DESA, we consider the
slope of the energy landscape, dE/dx, rather than the energy
itself. Differentiating Eq. 3 with respect to x we obtain

$$\frac{dE_j}{dx}(x) = -k_B T \frac{d}{dx}\left[\ln(p_j(x))\right] - a(x - x_j). \qquad (7)$$



An important feature of this equation is that the constants
c_j are eliminated, so that it is not necessary to find a self-consistent
solution to obtain the global function dE/dx. At
any given point x along the landscape, dE/dx can be obtained
by averaging dE_j/dx obtained from the system at various
constraint origins.

FIG. 1. The extension of a DNA hairpin as a function of time as the
constraint origin is moved. In (a) the hairpin is initially closed and
the constraint moves at 25 nm/s. In (b) the hairpin is initially open
and the constraint moves at -25 nm/s.



II. DESCRIPTION OF DESA

In order to define the method of differential energy surface
analysis, we assume a thermally driven system with
one reaction coordinate x which is characterized by an energy
function E(x). We assume that the dynamics of the system
are measured in the presence of a harmonic constraint of
stiffness a for N distinct constraint origins x_j. For each x_j,
the time series of x is used to compile a histogram H_j(x_i)
containing M_j total samples. We assume that the histogram
binning is consistent for all x_j, and that the values of a and x_j
are chosen so that there is significant overlap between the histograms.
The slope of the energy landscape dE/dx at position
x_i is given by

$$\frac{dE}{dx}(x_i) = \frac{\sum_j M_j\, p_j(x_i)\, \frac{dE_j}{dx}(x_i)}{\sum_j M_j\, p_j(x_i)}, \qquad (8)$$

where the summation is over the constraint origins x_j, and
dE_j/dx(x_i) is defined by Eq. 7. Interpreting this formula, the
value of dE/dx at position x_i is a weighted average of
dE_j/dx found from the N systems with constraint origins x_j.
Using p_j(x_i) Δx = H_j(x_i)/M_j, we can express Eq. 8 entirely
in terms of histogram counts, as

$$\frac{dE}{dx}(x_i) = \frac{\sum_j H_j(x_i)\, \frac{dE_j}{dx}(x_i)}{\sum_j H_j(x_i)}. \qquad (9)$$

This formula has been used to reconstruct energy landscapes
of molecular dynamics simulations [13] and experimental
data [8]. We will show below that Eq. 9 gives an optimal estimation
of dE/dx.
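The histogram-weighted average of Eqs. 7-9 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions rather than the authors' code: the function name is hypothetical, the derivative of ln p_j is taken as a finite difference of ln H_j between adjacent bins, the slope is evaluated at the midpoint between the two bins, and the weight is the mean of the two counts (which reduces to H_j(x_i) when adjacent counts are similar).

```python
import math

def desa_slope(hists, centers, origins, stiffness, kT):
    """Histogram-weighted average of dE_j/dx (Eqs. 7-9).

    hists[j][i] : counts H_j(x_i) on a common, evenly spaced grid
    Returns (midpoints, slopes); slopes[i] is None where no window
    has counts in both bin i and bin i+1.
    """
    dx = centers[1] - centers[0]
    mids, slopes = [], []
    for i in range(len(centers) - 1):
        x_mid = 0.5 * (centers[i] + centers[i + 1])
        num = den = 0.0
        for h, xj in zip(hists, origins):
            if h[i] > 0 and h[i + 1] > 0:
                # Eq. 7 with d(ln p_j)/dx as a finite difference of ln H_j
                dEj = (-kT * math.log(h[i + 1] / h[i]) / dx
                       - stiffness * (x_mid - xj))
                w = 0.5 * (h[i] + h[i + 1])  # weight ~ H_j(x_i)
                num += w * dEj
                den += w
        mids.append(x_mid)
        slopes.append(num / den if den > 0 else None)
    return mids, slopes
```

No iteration is needed: each bin is handled independently, and the unknown constants c_j never appear.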

When determining the mean value of a variable from data
points which have differing uncertainty, the maximum likelihood
solution is

$$\bar{a} = \frac{\sum_i w_i a_i}{\sum_i w_i}, \qquad \sigma_{\bar{a}}^2 = \frac{1}{\sum_i w_i}, \qquad w_i = \frac{1}{\sigma_i^2}, \qquad (10)$$

where σ_i is the standard deviation of the statistical ensemble
from which a_i is taken, \bar{a} is the mean of a and σ_{\bar{a}} is the
standard deviation of \bar{a}. In order to show that Eq. 8 is a maximum
likelihood estimate of dE/dx we must show that the choice
w_j(x_i) = H_j(x_i) meets the criteria set out in Eq. 10.

Starting with Eq. 7, the evaluation of dE/dx will involve a
finite difference of the natural logarithm of H_j(x_i),

$$\frac{dE_j}{dx}(x_i) = -\frac{k_B T}{\Delta x}\left[\ln\left(H_j(x_{i+1}) \pm \Delta H_j(x_{i+1})\right) - \ln\left(H_j(x_i) \pm \Delta H_j(x_i)\right)\right] - a(x_i - x_j), \qquad (11)$$

where ΔH_j(x_i) represents the statistical uncertainty in
H_j(x_i), and Δx = x_{i+1} - x_i. We can re-express Eq. 11
as

$$\begin{aligned} \frac{dE_j}{dx}(x_i) &= -\frac{k_B T}{\Delta x}\left[\ln\left(H_j(x_{i+1})\left(1 \pm \frac{\Delta H_j(x_{i+1})}{H_j(x_{i+1})}\right)\right) - \ln\left(H_j(x_i)\left(1 \pm \frac{\Delta H_j(x_i)}{H_j(x_i)}\right)\right)\right] - a(x_i - x_j) \\ &= -\frac{k_B T}{\Delta x}\left[\ln\left(H_j(x_{i+1})\right) - \ln\left(H_j(x_i)\right) + \ln\left(1 \pm \frac{\Delta H_j(x_{i+1})}{H_j(x_{i+1})}\right) - \ln\left(1 \pm \frac{\Delta H_j(x_i)}{H_j(x_i)}\right)\right] - a(x_i - x_j), \end{aligned} \qquad (12)$$

so that the uncertainties in H_j(x_i) and H_j(x_{i+1}) produce additive
uncertainties in dE_j/dx. Assuming the uncertainty in
H_j(x_i) is statistical, the uncertainty terms can be simplified
using ΔH_j(x_i) = \sqrt{H_j(x_i)}, so that

$$\ln\left(1 \pm \frac{\Delta H_j(x_i)}{H_j(x_i)}\right) = \ln\left(1 \pm \frac{1}{\sqrt{H_j(x_i)}}\right) \approx \pm\frac{1}{\sqrt{H_j(x_i)}}, \qquad (13)$$

where the last step is an expansion of the expression to first
order. Since ΔH_j(x_i) and ΔH_j(x_{i+1}) are uncorrelated, the
errors arising from these terms add in quadrature. In the limit
that Δx is small we can neglect the difference between H_j(x_i)
and H_j(x_{i+1}), and using Eq. 13 approximate the uncertainty
in dE_j/dx as

$$\sigma_j(x_i) \approx \frac{\sqrt{2}\, k_B T}{\Delta x \sqrt{H_j(x_i)}}. \qquad (14)$$

The statistical weight required for maximum likelihood is
therefore

$$w_j(x_i) = \frac{1}{\sigma_j^2(x_i)} = \frac{1}{2}\left(\frac{\Delta x}{k_B T}\right)^2 H_j(x_i). \qquad (15)$$

Since an overall multiplicative factor will cancel out in Eq. 10
and not affect the calculation of the mean value, Eq. 8 is equivalent
to the maximum likelihood estimation of dE/dx and is
an optimal estimation.

FIG. 2. The biased energy of the hairpin, as calculated from data in
intervals 9 and 10 of Fig. 1(a) using Eq. 2.


III. APPLICATION OF DESA TO EXPERIMENTAL DATA 

The weighted histogram analysis method (WHAM) pro- 
duces an optimal estimation of p, which is closely related to 
E, and differential energy surface analysis (DESA) produces 
an optimal estimation of dE/dx. If data is statistically well 
converged, and collected from a system that thoroughly sam- 
ples its equilibrium ensemble, both DESA and WHAM will 
converge to the same underlying energy landscape. However, 
in experiments and in simulations, it is often a challenge to 
obtain adequate statistical convergence or to verify that the 
equilibrium ensemble has been adequately sampled. It is in- 
teresting to consider the accuracy of DESA, and whether it 
produces results which are significantly different from those 
produced by WHAM in cases where convergence of statistics 
is less than ideal. 

We will apply DESA and WHAM to a single molecule
experiment in which a DNA hairpin is unfolded under a
harmonic constraint applied by an optical trap. The hairpin
has sequence
CCGCGAGTTGATTCGCCATACACCTGCTAATCCCGGTCGCTTTTGCGACCGGGATTAGCAGGTGTATGGCGAATCAACTCGCGG,
which folds into a 40 base-pair stem with a 4-T loop. It is
connected to the boundary of the sample chamber on one side
and to a polystyrene micro-sphere on the other via biotin and
digoxigenin tagged double-stranded DNA linkers. When the
optical trap is held at constant position and intensity, the
combination of the restoring force imposed on the micro-sphere
by the optical trap and the elasticity of the handles produces a
harmonic constraint acting on the hairpin with a ≈ 200 pN/μm.
The position of the sample chamber is controlled by a
piezoelectric positioning stage with nanometer resolution,
and the origin of the constraint is controlled by varying the
position of the sample chamber with respect to the trap center.
The optical trap measures the instantaneous position of the
micro-sphere and instantaneous force applied to the tether as
the constraint origin is swept. By determining the total length
of the tether and subtracting off the instantaneous extension
of the double-stranded DNA handles (estimated using a
worm-like chain model of DNA elasticity) the extension
of the hairpin itself is determined [14]. The apparatus and
experimental procedure have been described elsewhere [8].
The measured time series, comprising approximately 10^5
samples, is shown for unfolding of the hairpin in Fig. 1(a),
and for folding of the hairpin in Fig. 1(b).

FIG. 3. (a) Reconstructed dE/dx plot. In (b)-(f) the dE_j/dx estimates from the 3rd, 6th, 9th, 12th and 15th intervals are compared with the
reconstructed dE/dx function. The dE_j/dx values are calculated using Eq. 7. For (a) the uncertainty is based on the maximum likelihood
result given in Eq. 10; for (b)-(f) the uncertainty is based on Eq. 12.

The record of extension vs. time of the experimental system 
is divided into 20 equal time intervals and the histogram of 
position is calculated for each interval. Calibration data is 
used to calculate the mean stiffness and origin of the constraint 
for each of the 20 intervals. Since the constraint origin moves
continuously as data is acquired, it is not constant within
each interval. However, the deviation of the constraint origin
from the mean value does not exceed ~1 nm in the course of
an interval, which implies an error in the constraint correction
of less than ~0.2 pN. Therefore analysis of each interval using
the mean constraint origin introduces a negligible error.
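The interval binning described above can be sketched in Python as follows. This is an illustrative sketch only: the function name, its arguments, and the decision to discard samples that fall outside the grid are assumptions made for the example, not details of the original analysis.

```python
def interval_histograms(series, n_intervals, x_min, x_max, n_bins):
    """Split a time series into equal intervals and histogram each one
    on a shared grid of n_bins bins spanning [x_min, x_max)."""
    width = (x_max - x_min) / n_bins
    centers = [x_min + (i + 0.5) * width for i in range(n_bins)]
    size = len(series) // n_intervals
    hists = []
    for j in range(n_intervals):
        counts = [0] * n_bins
        for x in series[j * size:(j + 1) * size]:
            i = int((x - x_min) // width)
            if 0 <= i < n_bins:       # samples outside the grid are dropped
                counts[i] += 1
        hists.append(counts)
    return centers, hists
```

Using one grid for every interval keeps bin i at the same extension x_i in all histograms, which is what allows the per-interval estimates to be averaged bin by bin.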

The biased energy surfaces obtained by applying Eq. 2
to the histograms compiled from two adjacent intervals in
Fig. 1(a) are shown in Fig. 2. The underlying potential, due to
the sequence dependence of the hairpin stem hybridization en- 
ergy, is the same for the two histograms shown, but the broad 
parabolic potential of the constraint shifts slightly as the con- 
straint origin is moved. The purpose of DESA (or WHAM) 
is to compensate for the effect of the harmonic constraint and 
extract the shape of the underlying energy surface. 

In Fig. 3 the averaged dE/dx function calculated from the
data in Fig. 1(a) is shown. In panels (b) through (f), 5 representative
dE_j/dx functions (calculated by applying Eq. 7 to
individual histograms) are selected from the 20 functions used
to compute the averaged function. Each dE_j/dx plot covers
a different domain of the extension and exhibits low uncertainty
near the center of the domain, where statistics are well
converged, and high uncertainty near the margins of the domain.
The energy of the hairpin as a function of extension
is not known a priori, but the individual dE_j/dx curves are
consistent, within experimental uncertainty, with the average
dE/dx function and with each other. The fact that all 20 of the
individual dE_j/dx functions (a subset of which is shown in
Fig. 3) agree with the average function is evidence that the same
underlying landscape is being measured for all 20 constraint
origins, and that the system has not jumped to a different branch
of the energy landscape as the constraint is moved. In Supplemental
appendix Figs. S2 and S3, DESA is applied to a simulated system for
which the energy surface is known, allowing direct confirmation
that DESA produces the correct dE/dx function.

FIG. 4. The dE/dx curves obtained from unfolding and folding of
the hairpin are compared.

FIG. 5. (a) Comparison of the energy surface obtained by DESA
and by WHAM from the time series in Fig. 1(a). (b) Comparison
of dE/dx obtained by DESA and WHAM from the time series in
Fig. 1(a).

In the umbrella sampling method, it is assumed that the bi- 
asing potential is time independent and that the system is in 
thermodynamic equilibrium as data is collected. In our experiment
the constraint origin moves continuously as data is acquired,
so analyzing the data using DESA or WHAM involves
an approximation. The most convincing evidence that
the system remains in effective equilibrium as the constraint
origin moves is that identical energy landscapes are obtained
for folding and unfolding of the hairpin, for which the constraint
origin moves in opposite directions. The energy landscapes
obtained from data in Figs. 1(a) and 1(b) are compared
in Fig. 4. No significant difference is found between landscapes
obtained for folding and unfolding of the hairpin.

There are systems, such as pseudoknots, G-quadruplex
DNA and others, which exhibit large irreversible steps when
disrupted in single molecule experiments [15, 16]. In such
cases, the techniques described here would not be suitable for
reconstructing the global energy landscape. Nonequilibrium
analysis methods have been developed which can determine
the energy surface from data taken far from equilibrium [17-22].
These methods typically require a great deal of experimental
data, since they involve measuring the dependence of
the disruption force on force loading rate or averaging many
trajectories with weights determined by the external work performed.
In cases where a molecule can be disrupted progressively
in a reversible manner, we find it advantageous to employ
equilibrium methods, which require only a single trajectory.

In order to verify the DESA result, the data shown in Fig. 1
was also analyzed using WHAM, as defined by Eqs. 5 and 6.
In Fig. 5(a) the energy surface obtained by WHAM is plotted
along with the energy surface obtained from integration of the
dE/dx curve shown in Fig. 3. The DESA and WHAM curves
are indistinguishable. The overall slope of the energy landscape
is reproduced, as well as the ripples that arise from the
sequence dependence of the DNA hybridization energy. The
sequence dependence is more apparent in the plot of dE/dx,
which is shown in Fig. 5(b). As in the case of the energy, results
obtained from DESA and WHAM are indistinguishable.
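Recovering the energy surface from the DESA slope requires a numerical integration of dE/dx. The text does not state which quadrature was used; the Python sketch below assumes a cumulative trapezoid rule and fixes the arbitrary constant by setting the energy to zero at the first grid point. The function name is hypothetical.

```python
def integrate_slope(x, slope):
    """Cumulative trapezoid integral of dE/dx, with E(x[0]) = 0."""
    energy = [0.0]
    for i in range(1, len(x)):
        step = 0.5 * (slope[i] + slope[i - 1]) * (x[i] - x[i - 1])
        energy.append(energy[-1] + step)
    return energy

# A linear slope dE/dx = 2x integrates to E = x^2.
E_check = integrate_slope([0.0, 1.0, 2.0, 3.0], [0.0, 2.0, 4.0, 6.0])
```

Because integration accumulates the error of every bin along the way, small features are usually easier to see in dE/dx than in E itself.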



IV. DIAGNOSTICS IN THE DESA METHOD 

Recent work has provided criteria for error estimation in
free energy calculations based on the weighted histogram
analysis method [23, 24]. The DESA result is obtained by
straightforward averaging of dE_j/dx estimates obtained from
different constraint origins. Use of the maximum likelihood
estimation assures that the optimal value of dE/dx is produced,
and straightforward error propagation can be used to
obtain the uncertainty in the values of dE/dx obtained. However,
when employing umbrella sampling, it is necessary to assume
that the data acquired with different constraint origins is
sampling the same energy surface. One potential pitfall of the
DESA method is that we can obtain a smooth average dE/dx
curve, even if the system jumps to a different energy surface
as the constraint origin is moved. Here we introduce two criteria
that can be applied in order to detect inconsistencies in
dE_j/dx values obtained from different constraint origins and
we illustrate the results obtained when these criteria are applied
to experimental data.
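To make the error propagation concrete, here is a minimal Python sketch of the maximum likelihood average of Eq. 10 together with a per-bin slope uncertainty of the form sqrt(2/H) k_B T/Δx, which follows from adding two independent 1/sqrt(H) counting errors in quadrature. The function names are hypothetical and the sketch is not taken from the original analysis.

```python
import math

def weighted_mean(values, sigmas):
    """Eq. 10: maximum likelihood mean with weights w_i = 1/sigma_i^2.

    Returns (mean, standard deviation of the mean).
    """
    weights = [1.0 / s ** 2 for s in sigmas]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    return mean, math.sqrt(1.0 / total)

def slope_sigma(count, dx, kT):
    """Uncertainty of a finite-difference slope from bins holding
    about `count` samples each (two counting errors in quadrature)."""
    return math.sqrt(2.0 / count) * kT / dx

# Two estimates, 1.0 +/- 1.0 and 3.0 +/- 0.5: the tighter one dominates.
mean, sigma = weighted_mean([1.0, 3.0], [1.0, 0.5])
```

Since 1/slope_sigma^2 is proportional to the bin count, weighting each window by its histogram count gives the same mean as the maximum likelihood weights.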

The first method involves comparison of the biased energy
surfaces obtained from different constraint origins. Subtracting
two biased energies, we obtain

$$\begin{aligned} E_{b,k}(x) - E_{b,j}(x) &= \left[E_{hp}(x) + \frac{a}{2}(x - x_k)^2\right] - \left[E_{hp}(x) + \frac{a}{2}(x - x_j)^2\right] \\ &= x\left[a(x_j - x_k)\right] + \frac{a}{2}\left(x_k^2 - x_j^2\right), \end{aligned} \qquad (16)$$

where E_{b,j}(x) is the biased energy surface measured with
constraint origin x_j and E_{hp}(x) is the intrinsic energy of the
hairpin. The cancelation of E_{hp}(x) leaves terms which depend
only on the biasing potential. The constant term is not of
interest, since the energy itself is only defined up to an additive
constant. However, we expect the energy difference to be
a straight line with slope determined by the constraint
strength and the relative constraint displacement, a(x_j - x_k).
If a different effective energy surface is in effect after the constraint
origin is moved, E_{hp} will fail to cancel in Eq. 16 and
anomalous features will appear in the difference curve.

FIG. 6. The difference in biased energy for adjacent intervals. Data
is as in Fig. 2 except the time series is divided into six intervals rather
than twenty. The differences E_3(x) - E_2(x), E_4(x) - E_3(x) and
E_5(x) - E_4(x) are shown.

Applying this criterion to the single molecule experiment, we
recompiled the histograms from the data shown in Fig. 1(a)
using 6 intervals rather than 20 (to improve statistical convergence)
and subtracted the biased energy surfaces for adjacent
intervals. The result is shown in Fig. 6. The fact that straight
lines with the expected slope are found is strong evidence
that the E_hp effective at different constraint origins have canceled,
as expected, and that the dE/dx values being averaged
are mutually consistent.
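The straight-line check can be automated by fitting a line to the difference of two biased energy curves and comparing the fitted slope with the prediction a(x_j - x_k). The Python sketch below is one way to do this; the function names are hypothetical and an unweighted least-squares fit is assumed, which ignores the bin-to-bin variation in statistical uncertainty.

```python
def fit_line_slope(x, y):
    """Unweighted least-squares slope of y against x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def expected_slope(stiffness, x_j, x_k):
    """Slope of E_b,k(x) - E_b,j(x) predicted for a shared intrinsic energy."""
    return stiffness * (x_j - x_k)
```

For example, with a = 200 pN/μm and origins 10 nm apart the predicted slope has magnitude 2 pN, whatever the shape of the intrinsic energy; a fitted slope or residual that departs from this beyond the statistical scatter suggests that the intrinsic energy has not canceled.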

We can also test the self-consistency of the DESA analysis
by determining if dE_j/dx values obtained from individual
constraint origins deviate from the mean value in a manner
that is consistent with their statistical uncertainty. When considering
Fig. 3 above, it was observed that individual dE_j/dx
curves appear to be mutually consistent and values typically
fall within a standard deviation of the mean value. This criterion
can be expressed more rigorously using a simple reduced
χ^2 test. For each histogram bin i, we evaluate

$$\chi^2(x_i) = \frac{1}{N-1}\sum_j \frac{\left(\frac{dE_j}{dx} - \frac{dE}{dx}\right)^2}{\sigma_j^2}, \qquad (17)$$

where σ_j is the uncertainty in the value of dE_j/dx obtained from
the jth constraint position. If the deviations of the individual
values of dE_j/dx are consistent with the statistical uncertainty,
the value of χ^2 should be of order 1. A value significantly
larger than 1 indicates that systematic errors are present
in the dE_j/dx values.
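A reduced chi-squared for a single bin can be sketched in Python as follows. This is an illustration under stated assumptions: the function name is hypothetical, the mean is taken to be the inverse-variance weighted mean, and the sum is divided by the number of contributing estimates minus one, which is the usual convention when the mean is estimated from the same data but is not confirmed by the text.

```python
def reduced_chi2(slopes, sigmas):
    """Reduced chi-squared of per-window slope estimates in one bin.

    slopes[j], sigmas[j] : dE_j/dx and its uncertainty for window j
    Assumes at least two contributing windows.
    """
    weights = [1.0 / s ** 2 for s in sigmas]
    mean = sum(w * v for w, v in zip(weights, slopes)) / sum(weights)
    chi2 = sum(w * (v - mean) ** 2 for w, v in zip(weights, slopes))
    return chi2 / (len(slopes) - 1)

# Scatter comparable to the quoted uncertainty gives a value near 1.
value = reduced_chi2([9.0, 11.0, 10.0], [1.0, 1.0, 1.0])
```

Only windows that actually populate the bin should be passed in, so the divisor changes from bin to bin.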

In Fig. 7 the value of the reduced χ^2 is plotted as a function
of extension for the experimental data set. For the central portion
of the extension range, the χ^2 values are of order unity, indicating
that the DESA method is working as expected. However,
for the extremes of extension, the χ^2 values are higher,
which would indicate that contributions from different constraint
origins are not consistent. This can be attributed to the
fact that measurement uncertainty becomes more significant when
the hairpin is fully extended or fully folded.

When the constraint origin is positioned to stabilize the
hairpin in the fully open or fully closed conformation, the
conformational dynamics of the hairpin itself are minimal and
the fluctuations in the measured extension are largely due to
thermal fluctuations in the handles. At small extension, the
problem is compounded by the fact that the average force is
low, resulting in lower stiffness of the handles and increased
fluctuations. These measurement errors blur the sharp cutoff
that would otherwise appear in the probability density as
the hairpin reaches the fully-open or fully-folded state, and
similarly blur the energy function. The χ^2 function alerts us
to the fact that the energy surface is accurately measured in
the central part of the unfolding range, but is affected by
systematic errors near the fully folded or fully unfolded state.
(For comparison, the χ^2 test is applied to simulated data in
Supplemental appendix Fig. S4.) This illustrates how the χ^2
criterion can be an effective way of evaluating the quality of the
DESA determination of dE/dx as a function of the reaction
coordinate.

FIG. 7. The value of χ^2 as a function of extension, for the data shown
in Fig. 1(a).



V. CONCLUSIONS 

We have described the differential energy surface analysis
method, and shown that it is an optimal estimation of
dE/dx. This is complementary to the weighted histogram
method, which gives an optimal estimation of p. We have
demonstrated the application of the DESA technique to single
molecule experimental data and have shown that DESA
and WHAM produce indistinguishable results when applied
to the same data. DESA is defined by a formula that can be
written in closed form (Eq. 8) and is computationally simpler
than WHAM, which requires the self-consistent solution
of a system of nonlinear equations (Eq. 6). To apply
DESA we must assume that the system is sampling the same
branch of the folding energy surface as the constraint origin
is moved. We have described two criteria for determining
whether the dE/dx data obtained by application of DESA
are self-consistent, which can be used to validate this assumption.
Our experimental system is formally out of equilibrium,
since the constraint origin is moved continuously as data is acquired.
However, equilibrium techniques provide an accurate
measurement of the landscape in this case because the rate at
which the constraint is moved is slow compared with the rate
at which the system equilibrates. This is attributable to the fact
that no high energy barriers are present in the biased energy
landscape as the constraint origin is swept.

This work was supported by the Maryland Technology Development
Corporation.

In the body of the paper we apply DESA and WHAM to a data set collected in a
single molecule experiment. For comparison, we present a Langevin simulation of a
strongly damped particle moving in a one dimensional potential specified by a fourth
order polynomial. The polynomial was chosen to be qualitatively similar to the folding
energy of the hairpin. In the main body of the paper DESA is applied to an experimental
system which includes such practical challenges as calibration uncertainty, instrumental
noise and the finite time resolution of the apparatus. The simulated system is less
complex, but has the advantage that the results obtained from DESA can be compared
with the known energy surface.

The Langevin simulation uses effective energy E(x) = a(x - x_0)^4 + b(x - x_0)^2 +
c(x - x_0), where x_0 = 0.025 μm, a = 1.024 × 10^6 pN/μm^3, b = 320 pN/μm, and
c = 2.5 pN. This landscape is plotted in Fig. S1 and has one stable state and one
metastable state separated from the stable state by a shallow barrier. If this system
is simulated directly it will remain in the stable 'closed' state. In analogy with the
experimental system, a harmonic constraint with stiffness 500 pN/μm is imposed on the
system. In order to match the apparent time scale of the simulation to the experimental
data, the effective drag on the particle was chosen to be 0.05 pN·s/μm, and 400 substeps of
5 × 10^-8 s were taken between data samples using k_B T = 4.114 × 10^-3 pN·μm.
The constraint origin was swept from 0.0125 μm to 0.0475 μm during the course of 2
seconds, producing a time series of 10^5 samples. The measured time series is shown in
Fig. S2.

Figure S1: The energy function used in the Langevin simulation.

Figure S2: The coordinate of the simulated system as a function of time as the constraint
origin is moved at 17.5 nm/s.
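A simulation of this kind can be sketched with an Euler-Maruyama integrator for overdamped Langevin dynamics, using the parameter values quoted above. This is a hedged sketch, not the authors' code: the integration scheme, function names and random number generator are assumptions. In addition, the extracted text gives the quadratic term with a plus sign, but with b = 320 pN/μm that polynomial is convex and has a single minimum, so the sketch assumes the term enters as -b(x - x_0)^2, which reproduces the stable and metastable states described.

```python
import math
import random

# Parameters quoted in the text (units: pN, um, s)
A, B, C, X0 = 1.024e6, 320.0, 2.5, 0.025
STIFFNESS, DRAG, KT = 500.0, 0.05, 4.114e-3

def intrinsic_force(x):
    """-dE/dx for E = A u^4 - B u^2 + C u, with u = x - X0.

    The minus sign on the quadratic term is an assumption (see lead-in).
    """
    u = x - X0
    return -(4 * A * u ** 3 - 2 * B * u + C)

def simulate(n_samples, substeps=400, dt=5e-8, start=0.0125, stop=0.0475,
             kT=KT, seed=0):
    """Overdamped Langevin dynamics (Euler-Maruyama) under a swept
    harmonic constraint; returns (origins, positions), one pair per sample."""
    rng = random.Random(seed)
    noise = math.sqrt(2 * kT * dt / DRAG)   # rms thermal displacement per step
    total_steps = n_samples * substeps
    x = start
    origins, positions = [], []
    step = 0
    for _ in range(n_samples):
        for _ in range(substeps):
            origin = start + (stop - start) * step / total_steps
            force = intrinsic_force(x) - STIFFNESS * (x - origin)
            x += force * dt / DRAG + noise * rng.gauss(0.0, 1.0)
            step += 1
        origins.append(origin)
        positions.append(x)
    return origins, positions
```

Calling simulate(100000) would correspond to the 2 second sweep described in the text, but in pure Python that is 4 × 10^7 substeps and slow, so a shorter run or a compiled integrator is more practical for experimentation.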

In Fig. S3 the averaged dE/dx function, calculated from the simulated data in
Fig. S2, is shown and compared with dE/dx of the theoretical energy function. Agreement
between the DESA result and the theoretical curve is consistent with the statistical
uncertainty in the averaged function. In panels (b) through (f), 5 representative
dE_j/dx functions are selected from the 20 functions used to compute the averaged
function. Each individual dE_j/dx plot covers a different domain of
the extension and exhibits low uncertainty near the center of the domain, where statistics
are well converged, and high uncertainty near the margins of the domain. The
individual dE_j/dx functions are consistent with the theoretical dE/dx within
uncertainty. A similar relationship between the individual dE_j/dx and the
average dE/dx is observed in the experimental data shown in the main body of the
paper. The fact that the energy in the simulation is known a priori allows a rigorous
evaluation of the results of the DESA method.

The χ^2 measure defined in the main manuscript is shown for the experimental and
simulated system in Fig. S4. The simulated system gives χ^2 of order 1, as expected.
The experimental system behaves in a similar way for the central portion of the reaction
coordinate, but exhibits systematic error for large and small extension, as discussed in
the main text.

Figure S3: (a) Reconstructed dE/dx plot of the simulated system in Fig. S2. In (b)-(f)
the dE_j/dx estimates from the 3rd, 6th, 9th, 12th and 15th intervals are compared with
the reconstructed dE/dx function.

Figure S4: The value of χ^2 as a function of extension, for the experimental system
considered in the body of the paper and for the simulation.